Prosodic annotation in the new corpus of Russian spontaneous speech CoRuSS
Abstract
This paper deals with the intonation of spontaneous Russian. It describes the principles of prosodic annotation used in CoRuSS, a new corpus of spontaneous speech, and presents statistical data derived from this corpus. The prosodic annotation system was developed specifically for this purpose; it extends the well-known system of Intonation Constructions by E. A. Bryzgunova (7 ICs). Originally intended for teaching Russian as L2, the Bryzgunova system proved insufficient for a detailed and adequate description of the intonation of spoken Russian. The results of the study provide statistical data on the frequency of particular intonation patterns in spontaneous Russian speech and form a basis for comparison with existing data on the intonation of Russian read speech. They confirm previously obtained information about new tendencies in the realization of Russian non-final and question intonation by young native speakers, and they allow the realization and frequency of particular intonation patterns to be compared across other age groups of native Russian speakers.
Similar resources
CoRuSS - a New Prosodically Annotated Corpus of Russian Spontaneous Speech
This paper describes speech data recording, processing and annotation of a new speech corpus CoRuSS (Corpus of Russian Spontaneous Speech), which is based on connected communicative speech recorded from 60 native Russian male and female speakers of different age groups (from 16 to 77). Some Russian speech corpora available at the moment contain plain orthographic texts and provide some kind of ...
A Fully Annotated Corpus of Russian Speech
The paper introduces CORPRES – a fully annotated Russian speech corpus developed at the Department of Phonetics, St. Petersburg State University as a result of a three-year project. The corpus includes samples of different speaking styles produced by 4 male and 4 female speakers. Six levels of annotation cover all phonetic and prosodic information about the recorded speech data, including label...
Design and Evaluation of Shared Prosodic Annotation for Spontaneous French Speech: From Expert Knowledge to Non-Expert Annotation
In the area of large French speech corpora, there is a demonstrated need for a common prosodic notation system allowing for easy data exchange, comparison, and automatic annotation. The major questions are: (1) how to develop a single simple scheme of prosodic transcription which could form the basis of guidelines for non-expert manual annotation (NEMA), used for linguistic teaching and researc...
Prosody in a corpus of French spontaneous speech: perception, annotation and prosody ~ syntax interaction
Our study focuses on the issue of prosodic annotation and of the prosody ~ syntax interface in conversation and is based on a large corpus of conversational speech in French. The results of inter-transcriber agreement tests show that two expert transcribers are consistent in their labeling of prosodic phrasing and that the consistency is well above chance. A qualitative analysis reveals transcri...
Spontaneous Speech in the Spoken Dutch Corpus
In this paper the Spoken Dutch Corpus project is presented, a joint Flemish-Dutch undertaking aimed at the compilation and annotation of a corpus of 1,000 hours of spoken Dutch. Upon completion, the corpus will constitute a valuable resource for research in the fields of (computational) linguistics and language and speech technology. Although the corpus will contain a fair amount of read speech...